126 research outputs found

    Developing a distributed electronic health-record store for India

    Get PDF
    The DIGHT project is addressing the problem of building a scalable and highly available information store for the Electronic Health Records (EHRs) of the over one billion citizens of India

    Shuffling with a Croupier: Nat-Aware Peer-Sampling

    Get PDF
    Despite much recent research on peer-to-peer (P2P) protocols for the Internet, there have been relatively few practical protocols designed to explicitly account for Network Address Translation gateways (NATs). Those P2P protocols that do handle NATs circumvent them using relaying and hole-punching techniques to route packets to nodes residing behind NATs. In this paper, we present Croupier, a peer sampling service (PSS) that provides uniform random samples of nodes in the presence of NATs in the network. It is the first NAT-aware PSS that works without the use of relaying or hole-punching. By removing the need for relaying and hole-punching, we decrease the complexity and overhead of our protocol as well as increase its robustness to churn and failure. We evaluated Croupier in simulation, and, in comparison with existing NAT-aware PSS’, our results show similar randomness properties, but improved robustness in the presence of both high percentages of nodes behind NATs and massive node failures. Croupier also has substantially lower protocol overhead

    Gozar: NAT-friendly Peer Sampling with One-Hop Distributed NAT Traversal

    Get PDF
    Gossip-based peer sampling protocols have been widely used as a building block for many large-scale distributed applications. However, Network Address Translation gateways (NATs) cause most existing gossiping protocols to break down, as nodes cannot establish direct connections to nodes behind NATs (private nodes). In addition, most of the existing NAT traversal algorithms for establishing connectivity to private nodes rely on third party servers running at a well-known, public IP addresses. In this paper, we present Gozar, a gossip-based peer sampling service that: (i) provides uniform random samples in the presence of NATs, and (ii) enables direct connectivity to sampled nodes using a fully distributed NAT traversal service, where connection messages require only a single hop to connect to private nodes. We show in simulation that Gozar preserves the randomness properties of a gossip-based peer sampling service. We show the robustness of Gozar when a large fraction of nodes reside behind NATs and also in catastrophic failure scenarios. For example, if 80% of nodes are behind NATs, and 80% of the nodes fail, more than 92% of the remaining nodes stay connected. In addition, we compare Gozar with existing NAT-friendly gossip-based peer sampling services, Nylon and ARRG. We show that Gozar is the only system that supports one-hop NAT traversal, and its overhead is roughly half of Nylon’s

    GLive: The Gradient overlay as a market maker for mesh-based P2P live streaming

    Get PDF
    Peer-to-Peer (P2P) live video streaming over the Internet is becoming increasingly popular, but it is still plagued by problems of high playback latency and intermittent playback streams. This paper presents GLive, a distributed market-based solution that builds a mesh overlay for P2P live streaming. The mesh overlay is constructed such that (i) nodes with increasing upload bandwidth are located closer to the media source, and (ii) nodes with similar upload bandwidth become neighbours. We introduce a market-based approach that matches nodes willing and able to share the stream with one another. However, market-based approaches converge slowly on random overlay networks, and we improve the rate of convergence by adapting our market-based algorithm to exploit the clustering of nodes with similar upload bandwidths in our mesh overlay. We address the problem of free-riding through nodes preferentially uploading more of the stream to the best uploaders. We compare GLive with our previous tree-based streaming protocol, Sepidar, and NewCoolstreaming in simulation, and our results show significantly improved playback continuity and playback latency

    Converging an Overlay Network to a Gradient Topology

    Get PDF
    In this paper, we investigate the topology convergence problem for the gossip-based Gradient overlay network. In an overlay network where each node has a local utility value, a Gradient overlay network is characterized by the properties that each node has a set of neighbors with the same utility value (a similar view) and a set of neighbors containing higher utility values (gradient neighbor set), such that paths of increasing utilities emerge in the network topology. The Gradient overlay network is built using gossiping and a preference function that samples from nodes using a uniform random peer sampling service. We analyze it using tools from matrix analysis, and we prove both the necessary and sufficient conditions for convergence to a complete gradient structure, as well as estimating the convergence time and providing bounds on worst-case convergence time. Finally, we show in simulations the potential of the Gradient overlay, by building a more efficient live-streaming peer-to-peer (P2P) system than one built using uniform random peer sampling.Comment: Submitted to 50th IEEE Conference on Decision and Control (CDC 2011

    Cloud-native RStudio on Kubernetes for Hopsworks

    Full text link
    In order to fully benefit from cloud computing, services are designed following the "multi-tenant" architectural model, which is aimed at maximizing resource sharing among users. However, multi-tenancy introduces challenges of security, performance isolation, scaling, and customization. RStudio server is an open-source Integrated Development Environment (IDE) accessible over a web browser for the R programming language. We present the design and implementation of a multi-user distributed system on Hopsworks, a data-intensive AI platform, following the multi-tenant model that provides RStudio as Software as a Service (SaaS). We use the most popular cloud-native technologies: Docker and Kubernetes, to solve the problems of performance isolation, security, and scaling that are present in a multi-tenant environment. We further enable secure data sharing in RStudio server instances to provide data privacy and allow collaboration among RStudio users. We integrate our system with Apache Spark, which can scale and handle Big Data processing workloads. Also, we provide a UI where users can provide custom configurations and have full control of their own RStudio server instances. Our system was tested on a Google Cloud Platform cluster with four worker nodes, each with 30GB of RAM allocated to them. The tests on this cluster showed that 44 RStudio servers, each with 2GB of RAM, can be run concurrently. Our system can scale out to potentially support hundreds of concurrently running RStudio servers by adding more resources (CPUs and RAM) to the cluster or system.Comment: 8 pages, 4 figure

    Digital Atlas of American Religion

    Get PDF
    poster abstractOur poster presentation will introduce DAAR, the Digital Atlas of American Religion (http://www.religionatlas.org). DAAR is a web-based research platform with innovative data exploration and visualization tools to support research in the humanities. Time and location are essential components of humanities exploratory research; however, GIS technology, especially in its web form, does not support the easy exploration and visualization of the complex spatio-temporal data manipulated by humanists. DAAR presents researchers with an integrated solution stemming from several fields including GIS, visualization, and classification theory. Researchers using DAAR are provided with the following exploration/visualization techniques: maps, cartograms, tree maps, pie charts, and motion charts. Using these tools and methods, researchers can explore patterns, trends, and relationships in the data that otherwise are not apparent with traditional GIS or statistical software. DAAR allows researchers to understand the multiple dimensions and diversity of religion across geographies, or within geographies. Paired with historic census data, it allows them to explore relationships to give better context and meaning to the patterns and trends. Maps provide the spatial patterns and relationships, tree maps show relative strength and relationships, charts show trends, cartograms reveal relative numbers of adherence, and motion charts animate trends over time
    corecore